Inter-species validation for domain combination based protein-protein interaction prediction method.

Authors

  • Woo-Hyuk Jang
  • Dong-Soo Han
  • Hong-Soog Kim
  • Sung-Doke Lee
Abstract

The Domain Combination based Protein-Protein Interaction Prediction (DCPPIP) method has been shown to achieve outstanding prediction accuracy for Yeast proteins. However, it is not yet clear whether the method remains valid and achieves comparable prediction accuracy for proteins in other species. In this paper, we report the validation results of applying the DCPPIP method to Fly and Human proteins. We also report the results of inter-species validation, in which the protein interaction and domain data of other species are used as the learning set. For validation, 10,351 interacting protein pairs are used for Fly and 2,345 protein pairs for Human. 80% of the data are used as the learning set and 20% are reserved as the test set. High prediction accuracies (Fly: sensitivity approximately 77%, specificity approximately 92%; Human: sensitivity approximately 96%, specificity approximately 95%) are achieved in both the Fly and Human cases. Interactions of proteins in Human, Mouse, H. pylori, E. coli, and C. elegans are predicted and validated using the protein interaction and domain data of Yeast, Fly, and the combination of Yeast and Fly, respectively. Again, good prediction accuracy is achieved when the test protein pair shares domains with the proteins in the learning set. The notion of a Domain Overlapping Rate (DOR) among species is introduced in this paper, and the correlation between DOR and prediction accuracy is examined. According to our test results, there is a fairly clear correlation between DOR and prediction accuracy.
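The evaluation protocol in the abstract (an 80%/20% learning/test split, scored by sensitivity and specificity) can be illustrated with a minimal Python sketch. The function names are hypothetical. The abstract also does not define DOR, so `domain_overlap_rate` below rests on an assumed definition (the fraction of the test species' domains that also occur in the learning species' domains), which may differ from the authors' formula.

```python
import random


def split_pairs(pairs, train_fraction=0.8, seed=0):
    """Shuffle protein pairs and split them into learning and test sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary interaction labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def domain_overlap_rate(test_domains, learning_domains):
    """ASSUMED definition, not taken from the paper: the share of the
    test species' domains that also appear in the learning species."""
    test_set = set(test_domains)
    if not test_set:
        return 0.0
    return len(test_set & set(learning_domains)) / len(test_set)


# Usage with placeholder data (not the paper's data set).
learning, test = split_pairs(range(10351))
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
dor = domain_overlap_rate({"D1", "D2", "D3", "D4"}, {"D1", "D2", "D9"})
```

Sensitivity here is the fraction of truly interacting pairs predicted as interacting, and specificity is the fraction of non-interacting pairs predicted as non-interacting. Under the assumed DOR definition, a low value would mean that few test-species domains were seen during learning, which is the situation in which the abstract reports weaker accuracy.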


Similar resources

Discovering Domains Mediating Protein Interactions

Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi-domain proteins, and the interaction between them is often defined by pairs of their domains. Most earlier studies focus only on interacting domain pairs. However, they do not consider the in...


Prediction Accuracy Evaluation of Domain and Domain Combination Based Prediction Methods for Protein-Protein Interaction

Since the proposal of the domain based protein-protein interaction prediction method by [3], there have been many attempts to improve domain based protein-protein interaction prediction methods [1]. Among them, the domain combination based protein-protein interaction method by [2] is quite appealing in the sense that it achieves very impressive prediction accuracy in some situations. However there is no conc...


Domain Linker Region Knowledge Contributes to Protein-protein Interaction Prediction

Protein-protein interaction has proven to be a valuable piece of biological knowledge and a starting point for understanding the internal workings of the cell. In this paper, we propose a novel method for protein-protein interaction prediction using only the primary structural information of the protein sequence. The method is developed based on inter-domain linker region knowledge and a combin...


Prediction of Coffee Effects in Rats with Healthy and NAFLD Conditions Based on Protein-Protein Interaction Network Analysis

Background and objectives: Non-alcoholic fatty liver disease (NAFLD) is a common liver condition. Coffee consumption, meanwhile, has shown promise for gastrointestinal diseases. The aim of the present study was to detect the most valuable biomarkers of decaffeinated coffee treatment in healthy and non-alcoholic fatty liver disease conditions. Methods: ...


Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks

Background: Prediction of protein localization is among the most important issues in bioinformatics and is used for the prediction of proteins in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied to predict intracellular protein locations. These algorithms use the features extracted from pro...



Journal title:
  • Genome informatics. International Conference on Genome Informatics

Volume 16, Issue 2

Pages: -

Publication date: 2005